Course Information
Course Title
數據分析與流形學習
Data Analysis and Manifold Learning 
Semester
112-1 
Intended Audience
College of Electrical Engineering and Computer Science, Master's Program in Data Science
Instructor
林澤佑 
Course Number
Data5008 
Course Identifier
946 U0080 
Section
 
Credits
3.0 
Full/Half Year
Half year
Required/Elective
Elective
Class Time
Wednesday, periods 3-5 (10:20-13:10)
Location
綜402 
Remarks
Undergraduate students who wish to enroll should contact the instructor and attend the first week of class.
Restricted to master's students and above.
Maximum enrollment: 30 students.
 
Course Outline
Course Description

Manifold learning is a class of techniques used to identify patterns in high-dimensional data. The underlying assumption is that the data lies on or near a lower-dimensional manifold embedded in the high-dimensional space. The goal of manifold learning is to discover this lower-dimensional structure and represent the data in a simpler, more interpretable form. Common manifold learning techniques include principal component analysis (PCA), multidimensional scaling (MDS), ISOMAP, t-SNE, and many others. These methods differ in their assumptions, the types of data they are suitable for, and the complexity of the resulting representation.
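
As a concrete illustration of the methods named above, here is a minimal sketch (not part of the course materials) that embeds a synthetic Swiss-roll data set into two dimensions; the use of scikit-learn, the data set, and the hyperparameters are illustrative assumptions.

    # Illustrative sketch: compare several manifold learning methods on a
    # synthetic Swiss roll (assumes numpy and scikit-learn are installed).
    from sklearn.datasets import make_swiss_roll
    from sklearn.decomposition import PCA
    from sklearn.manifold import MDS, Isomap, TSNE

    # 1000 points sampled from a 2-D manifold (the roll) embedded in R^3.
    X, color = make_swiss_roll(n_samples=1000, random_state=0)

    embeddings = {
        "PCA": PCA(n_components=2).fit_transform(X),
        "MDS": MDS(n_components=2, random_state=0).fit_transform(X),
        "ISOMAP": Isomap(n_neighbors=10, n_components=2).fit_transform(X),
        "t-SNE": TSNE(n_components=2, random_state=0).fit_transform(X),
    }

    for name, Y in embeddings.items():
        print(name, X.shape, "->", Y.shape)  # each maps the data from R^3 to R^2

A linear method such as PCA can only flatten the roll, while neighborhood-based methods such as ISOMAP can unroll it; this difference in assumptions is precisely what the course examines.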

In this course, we will introduce the mathematical concepts and algorithms used in many manifold learning techniques, along with their underlying assumptions and limitations. At the end of the semester, students are required to complete a final project that applies manifold learning techniques. The project should involve preparing and analyzing real-world data using one or more manifold learning methods, and presenting the results in a clear and informative manner.

Course Objectives
Manifold learning is a branch of machine learning that focuses on nonlinear dimensionality reduction, which is often used for data preprocessing in data science. The goal of dimensionality reduction is to map data from a high-dimensional space to a lower-dimensional one while preserving the important information and relationships in the data.

In this course, we will provide an overview of some popular manifold learning techniques and demonstrate their implementation in Python. Both theoretical and practical aspects of each technique will be covered in the discussions.
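
As a minimal sketch of this theory-to-practice connection, assuming only numpy: PCA computed via the SVD of the centered data matrix (the relationship covered in Week 4 below). The function name and toy data are hypothetical, not course-provided code.

    # Illustrative sketch: PCA via SVD. If X_centered = U S V^T, then the rows
    # of X_centered @ V[:, :k] are the coordinates on the top-k principal axes.
    import numpy as np

    def pca_via_svd(X, k):
        """Project the rows of X (an (n, d) array) onto the top-k principal components."""
        X_centered = X - X.mean(axis=0)  # subtract each feature's mean
        U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
        return X_centered @ Vt[:k].T     # (n, k) low-dimensional representation

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))        # toy data: 200 points in R^5
    Y = pca_via_svd(X, k=2)
    print(Y.shape)                       # (200, 2)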

Week 1: Preliminary examination; basic graph theory and matrices associated with a graph [4]
Week 2: Special matrices and matrix eigenvalue problems
Week 3: Introduction to spectral and graph-based methods
Week 4: PCA and SVD [8]
Week 5: Exam I; Fisher linear discriminant [2]
Week 6: No class
Week 7: Laplacian embedding and spectral clustering [1] (see the sketch following this schedule)
Week 8: Multidimensional scaling [3]; locally linear embedding [11]
Week 9: ISOMAP [12]
Week 10: Exam II; introduction to kernel methods
Week 11: Kernel PCA [10]
Week 12: Diffusion kernels [5]
Week 13: Introduction to manifold reconstruction
Week 14: Local-surface-fitting-based methods [6], [7]
Week 15: Final project
Week 16: Final project
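
The sketch referenced in the Week 7 entry: a basic Laplacian embedding in the spirit of [1], using a k-nearest-neighbor graph with 0/1 weights and the unnormalized graph Laplacian. These construction choices and parameters are simplifying assumptions for illustration, not the course's implementation.

    # Illustrative sketch: Laplacian embedding (assumes numpy and scipy).
    import numpy as np
    from scipy.spatial.distance import cdist

    def laplacian_embedding(X, n_neighbors=10, n_components=2):
        n = X.shape[0]
        D = cdist(X, X)  # pairwise Euclidean distances
        # connect each point to its nearest neighbors (index 0 is the point itself)
        idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
        W = np.zeros((n, n))
        for i in range(n):
            W[i, idx[i]] = 1.0
        W = np.maximum(W, W.T)                 # symmetrize the adjacency matrix
        L = np.diag(W.sum(axis=1)) - W         # unnormalized graph Laplacian
        vals, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
        # drop the constant eigenvector (eigenvalue 0 for a connected graph)
        return vecs[:, 1:n_components + 1]

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))              # toy data: 100 points in R^3
    print(laplacian_embedding(X).shape)        # (100, 2)
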
Course Requirements
Note that this is a graduate-level course; mathematical maturity is essential for success.

Prerequisites: Calculus, Probability, Linear Algebra, Elementary Differential Geometry, Python
Homework: 50%
Exam I: 15%
Exam II: 15%
Project: 20% 
Expected Weekly Study Hours
 
Office Hours
 
Required Readings
 
References
Lectures will be based on slides drawn in part from the following references:
[1] Belkin, Mikhail, and Partha Niyogi. "Laplacian eigenmaps for dimensionality reduction and data representation." Neural Computation 15.6 (2003): 1373-1396.
[2] Bishop, Christopher M., and Nasser M. Nasrabadi. Pattern Recognition and Machine Learning. Vol. 4. No. 4. New York: Springer, 2006.
[3] Borg, Ingwer, and Patrick J. F. Groenen. Modern Multidimensional Scaling: Theory and Applications. Springer Science & Business Media, 2005.
[4] Chung, Fan R. K. Spectral Graph Theory. Vol. 92. American Mathematical Society, 1997.
[5] De la Porte, J., et al. "An introduction to diffusion maps." Proceedings of the 19th Symposium of the Pattern Recognition Association of South Africa (PRASA 2008), Cape Town, South Africa. 2008.
[6] Faigenbaum-Golovin, Shira, and David Levin. "Manifold reconstruction and denoising from scattered data in high dimension." Journal of Computational and Applied Mathematics (2022): 114818.
[7] Fefferman, Charles, et al. "Reconstruction and interpolation of manifolds. I: The geometric Whitney problem." Foundations of Computational Mathematics 20.5 (2020): 1035-1133.
[8] Jolliffe, Ian. "Principal component analysis." Encyclopedia of Statistics in Behavioral Science (2005).
[9] Ma, Yunqian, and Yun Fu. Manifold Learning Theory and Applications. Vol. 434. Boca Raton, FL: CRC Press, 2012.
[10] Mika, Sebastian, et al. "Kernel PCA and de-noising in feature spaces." Advances in Neural Information Processing Systems 11 (1998).
[11] Saul, Lawrence K., and Sam T. Roweis. "An introduction to locally linear embedding." Unpublished. Available at: http://www.cs.toronto.edu/~roweis/lle/publications.html (2000).
[12] Tenenbaum, Joshua B., Vin De Silva, and John C. Langford. "A global geometric framework for nonlinear dimensionality reduction." Science 290.5500 (2000): 2319-2323.